DRAFT Automatic Creation of Lexical Knowledge Bases: New Developments in Computational

Author

  • Kenneth C. Litkowski
Abstract

Text processing technologies require increasing amounts of information about words and phrases to cope with the massive amounts of textual material available today. Information retrieval search engines provide greater and greater coverage, but do not provide a capability for identifying the specific content that is sought. Greater reliance is placed on natural language processing (NLP) technologies, which, in turn, rely increasingly on semantic information in addition to syntactic information about lexical items. The structure and content of lexical entries have been expanding rapidly to meet these needs, but obtaining the necessary information for these lexical knowledge bases (LKBs) is a major problem. Computational lexicology, which began with somewhat halting attempts to extract lexical information from machine-readable dictionaries (MRDs) for use in NLP, is seeing the emergence of new techniques that offer considerable promise for populating and organizing LKBs. Many of these techniques involve computations within the LKBs themselves to create, propagate, and organize the lexical information.

1 Introduction

Computational lexicology began in the late 1960s and 1970s with attempts to extract lexical information from machine-readable dictionaries (MRDs) for use in natural language processing (NLP), primarily in extracting hierarchies of verbs and nouns. During the 1980s, NLP began reaching beyond syntactic information with a greater reliance on semantic information, locating this information within the lexicon. After the field concluded (in the early 1990s) that insufficient information about lexical items could be obtained from MRDs, new techniques emerged that offer considerable promise for populating and organizing lexical knowledge bases (LKBs).
An underlying reason for the realization of these techniques seems to be the increasing capability to handle the large amount of data that must be digested to deal with the overall content and complexity of semantics. This discussion begins with the assumptions about large amounts of information in lexical entries and the particular computations made with this information in NLP. From this starting point, the paper describes emerging techniques for populating and propagating information to lexical entries derived from existing information within the LKB. The primary motivations for extending lexical entries come from a need to provide greater internal

Related resources

A Tool For The Automatic Creation, Extension And Updating Of Lexical Knowledge Bases

A tool is described which helps in the creation, extension and updating of lexical knowledge bases (LKBs). Two levels of representation are distinguished: a static storage level and a dynamic knowledge level. The latter is an object-oriented environment containing linguistic and lexicographic knowledge. At the knowledge level, constructors and filters can be defined. Constructors are objects wh...

Automatic Thesaurus Generation from Raw Text using Knowledge-Poor Techniques

In addition to showing how lexical units are related within a field, domain-specific thesauri give an idea of what subjects are important to that field and are thus useful at many points in an information system. The major impediment to creation of thesauri has been the cost of their manual creation. We present here a number of automatic techniques that jointly produce a first draft of a thesaurus fro...

A Survey on Portuguese Lexical Knowledge Bases: Contents, Comparison and Combination

In the last decade, several lexical-semantic knowledge bases (LKBs) were developed for Portuguese, by different teams and following different approaches. Most of them are open and freely available for the community. Those LKBs are briefly analysed here, with a focus on size, structure, and overlapping contents. However, we go further and exploit all of the analysed LKBs in the creation of new L...

Acquiring Semantic Information in the TCL’s Computational Lexicon

Ontologies are the central component for the Semantic Web, since they can be used to explicitly represent the semantics of structured or semi-structured information. In this paper, we describe the recent developments of a lexical ontology named the TCL’s computational lexicon, which aims to serve as the core knowledge base for the Semantic Web. We focus on designing a new specification of the s...

A Constraint-Based Approach for Computational Lexicon Construction

Ontologies are the central component for the Semantic Web, since they can be used to explicitly represent the semantics of structured or semistructured information. In this paper, we describe the recent developments of a lexical ontology named the TCL's computational lexicon, which aims to serve as the core knowledge base for the Semantic Web. We focus on designing a new specification of the se...

Publication date: 1997